
cp: disable CG for 8B SFT (2508) into r0.3.0#2513

Closed
svcnvidia-nemo-ci wants to merge 1 commit into r0.3.0 from cherry-pick-2508-r0.3.0

Conversation

Contributor

@svcnvidia-nemo-ci svcnvidia-nemo-ci commented Feb 24, 2026

beep boop [🤖]: Hi @malay-nagda 👋,

we've cherry-picked #2508 into r0.3.0 for you! 🚀

Please review and approve this cherry pick at your convenience!

Summary by CodeRabbit

  • Chores
    • Updated Llama3 8B finetune preset configurations for improved performance.

Signed-off-by: Malay Nagda <malayn@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
@svcnvidia-nemo-ci
Contributor Author

/ok to test 4666f54


copy-pr-bot bot commented Feb 24, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Contributor

coderabbitai bot commented Feb 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c106d06 and 4666f54.

📒 Files selected for processing (1)
  • scripts/performance/configs/llama/llama3_workload_base_configs.py

📝 Walkthrough

Walkthrough

Two Llama3 8B finetune SFT presets have their cuda_graph_impl parameter changed from "transformer_engine" to "none" with a comment indicating CUDA Graphs reduce performance in this configuration context.

Changes

Cohort: Configuration Parameter Updates
File: scripts/performance/configs/llama/llama3_workload_base_configs.py
Summary: Modified cuda_graph_impl from "transformer_engine" to "none" in two Llama3 8B finetune SFT presets, since CUDA Graphs were observed to reduce performance in this configuration.
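The diff itself is a one-value toggle. As a rough illustration of what such a preset change might look like (the field name cuda_graph_impl comes from the PR summary; the dataclass and preset names below are hypothetical stand-ins, not the actual NeMo config code):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FinetunePreset:
    """Illustrative stand-in for a Llama3 8B SFT performance preset."""
    name: str
    cuda_graph_impl: str  # e.g. "transformer_engine" or "none"


# Before the change: the 8B SFT presets used Transformer Engine CUDA Graphs.
llama3_8b_sft = FinetunePreset("llama3_8b_sft", cuda_graph_impl="transformer_engine")

# After the change: CUDA Graphs disabled, since they were observed to
# reduce performance for this configuration.
llama3_8b_sft = replace(llama3_8b_sft, cuda_graph_impl="none")

print(llama3_8b_sft.cuda_graph_impl)  # -> none
```

The change leaves everything else in the presets untouched, which is consistent with CodeRabbit's "Trivial" review-effort estimate.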

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name: Test Results For Major Changes
Status: ⚠️ Warning
Explanation: This performance-related change disabling CUDA Graphs lacks the required before-and-after performance metrics and quantitative documentation.
Resolution: Add performance benchmark results comparing the two configurations with specific metrics (throughput, latency), test environment details, and a reference to the PR #2508 justification.
✅ Passed checks (3 passed)
  • Description Check — Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check — Passed. The title clearly indicates the primary change: disabling CG (CUDA Graphs) for 8B SFT configs in the r0.3.0 branch, which matches the actual modification of cuda_graph_impl settings.
  • Docstring Coverage — Passed. No functions were found in the changed files, so the docstring coverage check was skipped.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@ko3n1g
Contributor

ko3n1g commented Mar 3, 2026

Merge via #2509

@ko3n1g ko3n1g closed this Mar 3, 2026
@malay-nagda
Contributor

Added to the release branch with #2527.

Closing this.

